The IBM conversational telephony system for financial applications
Authors
Abstract
We describe our development work on a telephone-based conversational system in the domain of mutual fund transactions. This system uses several components, including robust large-vocabulary continuous speech recognition, natural language understanding, dialog management, and text-to-speech synthesis technologies.
Similar resources
Semi-Supervised Model Training for Unbounded Conversational Speech Recognition
For conversational large-vocabulary continuous speech recognition (LVCSR) tasks, up to about two thousand hours of audio is commonly used to train state-of-the-art models. Collection of labeled conversational audio, however, is prohibitively expensive, laborious, and error-prone. Furthermore, academic corpora like Fisher English (2004) or Switchboard (1992) are inadequate to train models with suf...
The 3G-324M Protocol for Conversational Video Telephony
third-generation (3G) networks, conversational video-telephony services are becoming a key differentiator between new 3G offerings and existing 2G/2.5G services. Although it’s possible to have limited video-based services—such as a multimedia messaging service—that deliver pictures and video clips over 2.5G services, these are delay-insensitive applications that could run over a packet-based wi...
Conversational quality estimation model for wideband IP-telephony services
As broadband and high-speed IP networks spread, IP-telephony services have become a popular speech communication application over IP networks. Recently, the speech quality of IP-telephony services has become close to that of conventional PSTN services. To provide better speech quality to users, speech communication with wider bandwidth (e.g., 7 kHz) is one of the most promising applications. To...
Using Random Forest Language Cts System
One of the challenges in large vocabulary speech recognition is the availability of large amounts of data for training language models. In most state-of-the-art speech recognition systems, n-gram models with Kneser-Ney smoothing still prevail due to their simplicity and effectiveness. In this paper, we study the performance of a new language model, the random forest language model, in the IBM co...
Multi-lingual and Multi-modal Speech Processing and Applications
Over the last decade, voice technologies for telephony and embedded solutions became much more mature, resulting in applications providing mobile access to digital information from anywhere. Both a growing demand for voice-driven applications in many languages and the need for improved usability and user experience now drive the exploration of multi-lingual speech processing techniques for reco...